Warning: file_put_contents(aCache/aDaily/post/opendatascience/-2330-2331-): Failed to open stream: No space left on device in /var/www/tg-me/post.php on line 50
Data Science by ODS.ai 🦜 | Telegram Webview: opendatascience/2331 -
Telegram Group & Telegram Channel
βš™οΈ SWE-rebench: Nebius AI R&D team presents new dataset for SWE tasks.

Researchers built an automated system to collect and validate thousands of real-world tasks from GitHub, designed for training and evaluation of LLMs in software engineering.

Main features of the system:
1️⃣ Automatic data collection: Continuously extracts issue-PR pairs from Python repositories.
2️⃣ LLM-based environment setup: LLM analyzes repositories, creates install instructions, and updates them if errors happen.
3️⃣ Execution-based validation: Each task is tested by automatic setup, test run, and dependency freezing to make it reproducible.
4️⃣ LLM quality annotation: Tasks are labeled for clarity, difficulty, and test correctness to support filtering.

Result:
SWE-rebench dataset: 21,000+ ready-to-use interactive tasks.
Continuous updates: Fresh data is added regularly.
Transparent evaluation: Tasks are used for public SWE-rebench leaderboard.

πŸš€ SWE-rebench gives researchers and developers real and validated tasks to work with LLMs in SWE field.

Technical report: arXiv
Dataset: SWE-rebench



tg-me.com/opendatascience/2331
Create:
Last Update:

βš™οΈ SWE-rebench: Nebius AI R&D team presents new dataset for SWE tasks.

Researchers built an automated system to collect and validate thousands of real-world tasks from GitHub, designed for training and evaluation of LLMs in software engineering.

Main features of the system:
1️⃣ Automatic data collection: Continuously extracts issue-PR pairs from Python repositories.
2️⃣ LLM-based environment setup: LLM analyzes repositories, creates install instructions, and updates them if errors happen.
3️⃣ Execution-based validation: Each task is tested by automatic setup, test run, and dependency freezing to make it reproducible.
4️⃣ LLM quality annotation: Tasks are labeled for clarity, difficulty, and test correctness to support filtering.

Result:
SWE-rebench dataset: 21,000+ ready-to-use interactive tasks.
Continuous updates: Fresh data is added regularly.
Transparent evaluation: Tasks are used for public SWE-rebench leaderboard.

πŸš€ SWE-rebench gives researchers and developers real and validated tasks to work with LLMs in SWE field.

Technical report: arXiv
Dataset: SWE-rebench

BY Data Science by ODS.ai 🦜





Share with your friend now:
tg-me.com/opendatascience/2331

View MORE
Open in Telegram


Data Science by ODS ai 🦜 Telegram | DID YOU KNOW?

Date: |

How to Use Bitcoin?

n the U.S. people generally use Bitcoin as an alternative investment, helping diversify a portfolio apart from stocks and bonds. You can also use Bitcoin to make purchases, but the number of vendors that accept the cryptocurrency is still limited. Big companies that accept Bitcoin include Overstock, AT&T and Twitch. You may also find that some small local retailers or certain websites take Bitcoin, but you’ll have to do some digging. That said, PayPal has announced that it will enable cryptocurrency as a funding source for purchases this year, financing purchases by automatically converting crypto holdings to fiat currency for users. β€œThey have 346 million users and they’re connected to 26 million merchants,” says Spencer Montgomery, founder of Uinta Crypto Consulting. β€œIt’s huge.”

Should I buy bitcoin?

β€œTo the extent it is used I fear it’s often for illicit finance. It’s an extremely inefficient way of conducting transactions, and the amount of energy that’s consumed in processing those transactions is staggering,” the former Fed chairwoman said. Yellen’s comments have been cited as a reason for bitcoin’s recent losses. However, Yellen’s assessment of bitcoin as a inefficient medium of exchange is an important point and one that has already been raised in the past by bitcoin bulls. Using a volatile asset in exchange for goods and services makes little sense if the asset can tumble 10% in a day, or surge 80% over the course of a two months as bitcoin has done in 2021, critics argue. To put a finer point on it, over the past 12 months bitcoin has registered 8 corrections, defined as a decline from a recent peak of at least 10% but not more than 20%, and two bear markets, which are defined as falls of 20% or more, according to Dow Jones Market Data.

Data Science by ODS ai 🦜 from br


Telegram Data Science by ODS.ai 🦜
FROM USA